20 research outputs found

    Multimodal Uncertainty Reduction for Intention Recognition in Human-Robot Interaction

    Assistive robots can potentially improve the quality of life and personal independence of elderly people by supporting everyday life activities. To guarantee a safe and intuitive interaction between human and robot, human intentions need to be recognized automatically. As humans communicate their intentions multimodally, the use of multiple modalities for intention recognition may not only increase the robustness against failure of individual modalities but, more importantly, reduce the uncertainty about the intention to be predicted. This is desirable because, particularly in direct interaction between robots and potentially vulnerable humans, minimal uncertainty about the situation as well as knowledge about this actual uncertainty is necessary. Thus, in contrast to existing methods, this work introduces a new approach for multimodal intention recognition that focuses on uncertainty reduction through classifier fusion. For the four considered modalities (speech, gestures, gaze directions, and scene objects), individual intention classifiers are trained, all of which output a probability distribution over all possible intentions. By combining these output distributions using the Bayesian method Independent Opinion Pool, the uncertainty about the intention to be recognized can be decreased. The approach is evaluated in a collaborative human-robot interaction task with a 7-DoF robot arm. The results show that fused classifiers, which combine multiple modalities, outperform the respective individual base classifiers with respect to increased accuracy, robustness, and reduced uncertainty. Comment: Submitted to IROS 201
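
    To illustrate the fusion step described above, here is a minimal sketch of an Independent Opinion Pool: the per-modality posteriors are multiplied elementwise and renormalized, under the assumption that the modalities are conditionally independent given the intention. The function name and the example probability vectors are hypothetical, not taken from the paper.

```python
import numpy as np

def independent_opinion_pool(posteriors):
    """Fuse per-modality intention posteriors by elementwise multiplication
    and renormalization (Independent Opinion Pool), assuming the modalities
    are conditionally independent given the intention."""
    fused = np.ones_like(posteriors[0])
    for p in posteriors:
        fused = fused * p
    return fused / fused.sum()

# Hypothetical classifier outputs over three possible intentions.
speech  = np.array([0.6, 0.3, 0.1])
gesture = np.array([0.5, 0.4, 0.1])
gaze    = np.array([0.4, 0.4, 0.2])
objects = np.array([0.7, 0.2, 0.1])

fused = independent_opinion_pool([speech, gesture, gaze, objects])
print(fused)  # the fused distribution is sharper than any single modality
```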

    MILD: Multimodal Interactive Latent Dynamics for Learning Human-Robot Interaction

    Modeling interaction dynamics to generate robot trajectories that enable a robot to adapt and react to a human's actions and intentions is critical for efficient and effective collaborative Human-Robot Interactions (HRI). Learning from Demonstration (LfD) methods from Human-Human Interactions (HHI) have shown promising results, especially when coupled with representation learning techniques. However, such methods for learning HRI either do not scale well to high dimensional data or cannot accurately adapt to changing via-poses of the interacting partner. We propose Multimodal Interactive Latent Dynamics (MILD), a method that couples deep representation learning and probabilistic machine learning to address the problem of two-party physical HRIs. We learn the interaction dynamics from demonstrations, using Hidden Semi-Markov Models (HSMMs) to model the joint distribution of the interacting agents in the latent space of a Variational Autoencoder (VAE). Our experimental evaluations for learning HRI from HHI demonstrations show that MILD effectively captures the multimodality in the latent representations of HRI tasks, allowing us to decode the varying dynamics occurring in such tasks. Compared to related work, MILD generates more accurate trajectories for the controlled agent (robot) when conditioned on the observed agent's (human) trajectory. Notably, MILD can learn directly from camera-based pose estimations to generate trajectories, which we then map to a humanoid robot without the need for any additional training. Comment: Accepted at the IEEE-RAS International Conference on Humanoid Robots (Humanoids) 202
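
    The conditioning step mentioned above, generating the robot's motion given the observed human's, can be illustrated with a small sketch. MILD models the joint distribution of both agents in the VAE latent space with an HSMM; ignoring the HSMM state sequence and the VAE encoder/decoder, conditioning one Gaussian component of such a joint model on the human's latent state reduces to standard Gaussian conditioning. All names and shapes below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def condition_robot_on_human(mu, Sigma, z_human, d_h):
    """Condition a joint Gaussian over [z_human, z_robot] on the observed
    human latent state and return mean and covariance of the robot part
    (standard Gaussian conditioning)."""
    mu_h, mu_r = mu[:d_h], mu[d_h:]
    S_hh = Sigma[:d_h, :d_h]
    S_rh = Sigma[d_h:, :d_h]
    S_rr = Sigma[d_h:, d_h:]
    K = S_rh @ np.linalg.inv(S_hh)              # regression of robot on human
    mu_r_given_h = mu_r + K @ (z_human - mu_h)  # conditional mean
    S_r_given_h = S_rr - K @ S_rh.T             # conditional covariance
    return mu_r_given_h, S_r_given_h
```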

    ExGenNet: Learning to Generate Robotic Facial Expression Using Facial Expression Recognition

    The ability of a robot to generate appropriate facial expressions is a key aspect of perceived sociability in human-robot interaction. Yet many existing approaches rely on a set of fixed, preprogrammed joint configurations for expression generation. Automating this process offers the potential to scale better to different robot types and a wider range of expressions. To this end, we introduce ExGenNet, a novel deep generative approach for facial expressions on humanoid robots. ExGenNets connect a generator network, which reconstructs simplified facial images from robot joint configurations, with a classifier network for state-of-the-art facial expression recognition. The robots' joint configurations are optimized for various expressions by backpropagating the loss between the predicted expression and the intended expression through the classification network and the generator network. To improve the transfer between human training images and images of different robots, we propose to use extracted features in the classifier as well as in the generator network. Unlike most studies on facial expression generation, ExGenNets can produce multiple configurations for each facial expression and be transferred between robots. Experimental evaluations on two robots with highly human-like faces, Alfie (Furhat Robot) and the android robot Elenoide, show that ExGenNet can successfully generate sets of joint configurations for predefined facial expressions on both robots. This ability of ExGenNet to generate realistic facial expressions was further validated in a pilot study in which the majority of human subjects could accurately recognize most of the generated facial expressions on both robots.
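
    As a rough illustration of the optimization loop described above, the following sketch treats the joint configuration as a free parameter and backpropagates the expression-classification loss through a frozen generator and classifier. It assumes both networks are differentiable, already-trained PyTorch modules; all names and hyperparameters are hypothetical, not the paper's exact setup.

```python
import torch

def optimize_joint_config(generator, classifier, target_expr, n_joints,
                          steps=200, lr=0.05):
    """Search for a joint configuration whose generated face image the
    (pre-trained) classifier labels as target_expr, by backpropagating the
    classification loss through both networks into the configuration."""
    q = torch.zeros(1, n_joints, requires_grad=True)   # joint configuration
    target = torch.tensor([target_expr])
    opt = torch.optim.Adam([q], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        image = generator(q)        # simplified facial image from joints
        logits = classifier(image)  # predicted facial expression
        loss = torch.nn.functional.cross_entropy(logits, target)
        loss.backward()             # gradients flow back through both networks
        opt.step()                  # only q is updated by the optimizer
    return q.detach()
```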

    Learning Coupled Forward-Inverse Models with Combined Prediction Errors

    Challenging tasks in unstructured environments require robots to learn complex models. Given a large amount of information, learning multiple simple models can offer an efficient alternative to a monolithic complex network. Training multiple models, that is, learning their parameters and their responsibilities, has been shown to be prohibitively hard, as the optimization is prone to local minima. To efficiently learn multiple models for different contexts, we thus develop a new algorithm based on expectation maximization (EM). In contrast to comparable concepts, this algorithm trains multiple modules of paired forward-inverse models by using the prediction errors of both the forward and the inverse models simultaneously. In particular, we show that our method yields a substantial improvement over only considering the errors of the forward models on tasks where the inverse space contains multiple solutions.
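
    A minimal sketch of the kind of E-step this suggests: each training sample is softly assigned to the paired forward-inverse modules, with responsibilities that depend on the combined prediction error of both models rather than the forward error alone. The function, array shapes, and temperature parameter below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def responsibilities(forward_errors, inverse_errors, beta=1.0):
    """E-step sketch: softly assign each sample to the paired forward-inverse
    modules using the combined prediction error of both models (lower combined
    error -> higher responsibility).

    Both error arrays have shape (n_samples, n_modules) and contain squared
    prediction errors per sample and module."""
    combined = forward_errors + inverse_errors
    log_r = -beta * combined
    log_r -= log_r.max(axis=1, keepdims=True)   # numerical stability
    r = np.exp(log_r)
    return r / r.sum(axis=1, keepdims=True)
```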

    Multimodal Uncertainty Reduction for Intention Recognition in Human-Robot Interaction

    Assistive robots can potentially improve the quality of life and personal independence of elderly people by supporting everyday life activities. To guarantee a safe and intuitive interaction between human and robot, human intentions need to be recognized automatically. As humans communicate their intentions multimodally, the use of multiple modalities for intention recognition may not only increase the robustness against failure of individual modalities but, more importantly, reduce the uncertainty about the intention to be recognized. This is desirable because, particularly in direct interaction between robots and potentially vulnerable humans, minimal uncertainty about the situation as well as knowledge about this actual uncertainty is necessary. Thus, in contrast to existing methods, this work introduces a new approach for multimodal intention recognition that focuses on uncertainty reduction through classifier fusion. For the four considered modalities (speech, gestures, gaze directions, and scene objects), individual intention classifiers are trained, all of which output a probability distribution over all possible intentions. By combining these output distributions using the Bayesian method Independent Opinion Pool [1], the uncertainty about the intention to be recognized can be decreased. The approach is evaluated in a collaborative human-robot interaction task with a 7-DoF robot arm. The results show that fused classifiers, which combine multiple modalities, outperform the respective individual base classifiers with respect to increased accuracy, robustness, and reduced uncertainty.

    Demonstration based trajectory optimization for generalizable robot motions

    Learning motions from human demonstrations provides an intuitive way for non-expert users to teach tasks to robots. In particular, intelligent robotic co-workers should not only mimic human demonstrations but should also be able to adapt them to varying application scenarios. As such, robots must have the ability to generalize motions to different workspaces, e.g. to avoid obstacles not present during the original demonstrations. Towards this goal, our work proposes a unified method to (1) generalize robot motions to different workspaces, using a novel formulation of trajectory optimization that explicitly incorporates human demonstrations, and (2) locally adapt and reuse the optimized solution in the form of a distribution of trajectories. This optimized distribution can be used, online, to quickly satisfy via-points and goals of a specific task. We validate the method using a 7 degrees of freedom (DoF) lightweight arm that grasps and places a ball into different boxes while avoiding obstacles that were not present during the original human demonstrations.
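
    A toy sketch of what such a demonstration-aware trajectory cost could look like: the first term keeps the optimized waypoints close to the distribution of demonstrated trajectories (a Mahalanobis distance under the demonstrations' precision matrix), and the second penalizes waypoints that come closer than a safety radius to an obstacle. This is a generic illustration under assumed names and shapes, not the paper's actual formulation.

```python
import numpy as np
from scipy.optimize import minimize

def demo_aware_cost(xi_flat, mu_demo, prec_demo, obstacles,
                    radius=0.1, w_obs=10.0):
    """Cost over a discretized trajectory: stay close to the demonstrated
    distribution (Mahalanobis distance under the demos' precision matrix)
    while penalizing waypoints within a safety radius of any obstacle."""
    xi = xi_flat.reshape(mu_demo.shape)             # (T, dof) waypoints
    diff = (xi - mu_demo).ravel()
    cost = diff @ prec_demo @ diff                  # demonstration term
    for obs in obstacles:                           # obstacle term
        dist = np.linalg.norm(xi - obs, axis=1)
        cost += w_obs * np.sum(np.maximum(0.0, radius - dist) ** 2)
    return cost

# Usage sketch, starting the optimization from the demonstrated mean:
# result = minimize(demo_aware_cost, mu_demo.ravel(),
#                   args=(mu_demo, prec_demo, obstacles))
```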

    Online Learning of an Open-Ended Skill Library for Collaborative Tasks

    Intelligent robotic assistants can potentially improve the quality of life for elderly people and help them maintain their independence. However, the number of different and personalized tasks renders pre-programming of such assistive robots prohibitively difficult. Instead, to cope with a continuous and open-ended stream of cooperative tasks, new collaborative skills need to be continuously learned and updated from demonstrations. To this end, we introduce an online learning method for a skill library of collaborative tasks that employs an incremental mixture model of probabilistic interaction primitives. This model chooses a corresponding robot response to a human movement, where the human intention is extracted from previously demonstrated movements. Unlike existing batch methods of movement primitives for human-robot interaction, our approach builds a library of skills online, in an open-ended fashion, and updates existing skills using new demonstrations. The resulting approach was evaluated both on a simple benchmark task and in an assistive human-robot collaboration scenario with a 7-DoF robot arm.
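
    The skill-selection step of such a mixture model can be sketched as a simple gating computation: each stored skill scores the observed human motion features under its human-side distribution, and the skill with the highest responsibility determines the robot's response. The dictionary keys, feature representation, and Gaussian assumption below are illustrative, not the specific model of the paper.

```python
import numpy as np
from scipy.stats import multivariate_normal

def select_skill(human_features, skills, priors):
    """Gating sketch: score each stored interaction skill by the likelihood of
    the observed human motion features under that skill's human-side Gaussian,
    weighted by a skill prior, and return responsibilities and the best skill."""
    scores = np.array([
        prior * multivariate_normal.pdf(human_features,
                                        mean=skill["mu_human"],
                                        cov=skill["cov_human"])
        for skill, prior in zip(skills, priors)
    ])
    responsibilities = scores / scores.sum()
    return responsibilities, int(np.argmax(responsibilities))
```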

    Learning Intention Aware Online Adaptation of Movement Primitives

    In order to operate close to non-experts, future robots require both an intuitive form of instruction accessible to laymen and the ability to react appropriately to a human co-worker. Instruction by imitation learning with probabilistic movement primitives (ProMPs) allows capturing tasks by learning robot trajectories from demonstrations, including the motion variability. However, appropriate responses to human co-workers during the execution of the learned movements are crucial for fluent task execution, perceived safety, and subjective comfort. To facilitate such appropriate responsive behaviors in human-robot interaction, the robot needs to be able to react to its human workspace co-inhabitant online, during the execution of the ProMPs. Thus, we learn a goal-based intention prediction model from human motions. Using this probabilistic model, we introduce intention-aware online adaptation to ProMPs. We compare two different novel approaches: first, online spatial deformation, which avoids collisions by changing the shape of the ProMP trajectories dynamically during execution while staying close to the demonstrated motions; and second, online temporal scaling, which adapts the velocity profile of a ProMP to avoid time-dependent collisions. We evaluate both approaches in experiments with non-expert users. The subjects reported a higher level of perceived safety and felt less disturbed during intention-aware adaptation, in particular during spatial deformation, compared to the non-adaptive behavior of the robot.
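
    To make the temporal-scaling idea concrete, here is a deliberately simplified sketch: the ProMP phase is advanced at full speed while the predicted human position stays far from the planned robot path, slowed down as the distance shrinks, and held constant below a minimum distance, so that time-dependent collisions are avoided by waiting rather than by deviating spatially. The specific scaling law, thresholds, and function name are assumptions for illustration, not the method's actual formulation.

```python
def scaled_phase(z, dz, min_dist, slow_dist=0.3, stop_dist=0.1):
    """Advance the ProMP phase z by at most dz: full speed while the predicted
    human position stays far from the planned robot path, proportionally slower
    as the distance shrinks, and no advance at all below the stop distance."""
    if min_dist <= stop_dist:
        scale = 0.0                                   # hold position and wait
    elif min_dist >= slow_dist:
        scale = 1.0                                   # nominal execution speed
    else:
        scale = (min_dist - stop_dist) / (slow_dist - stop_dist)
    return min(1.0, z + scale * dz)                   # keep phase in [0, 1]
```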

    Interactive Machine Learning for Assistive Robots

    Intelligent assistive robots can potentially support elderly persons and caregivers in their everyday lives and facilitate a closer collaboration between humans and machines as an essential part of the upcoming fifth industrial revolution. In contrast to classical robotic applications, where robots were mostly designed for repetitive tasks, assistive robots will face a variety of different tasks in close contact with everyday users. In particular, it is difficult to foresee the variety of applications beforehand, since they depend on a person's individual needs and preferences. This renders preprogramming of all tasks for assistive robots difficult and creates a need to explore methods by which robots can learn new tasks during deployment. Learning from and during direct interaction with humans provides a potentially powerful tool for an assistive robot to acquire new skills and to incorporate prior human knowledge during the exploration of novel tasks. Such an interactive learning process can not only help the robot to acquire new skills or profit from human prior knowledge but also facilitates the participation of inexperienced users or co-workers, which can lead to a higher acceptance of the robot. However, while human presence and assistance can be beneficial during the learning process, close contact with inexperienced users also imposes challenges. In shared workspaces or in close contact with everyday users, a robot should be able to adapt learned skills so as to disturb humans as little as possible. It also becomes important to evaluate human preferences regarding such adaptation strategies, their understanding of interactive learning processes, and different ways of providing human input to learning. To come closer to the goal of intelligent assistive robots, it is therefore important to develop novel interactive learning methods and evaluate them in different robotic applications.

    This thesis focuses on three main challenges related to the development of assistive intelligent robots and their interaction with everyday users. The different parts of the thesis contribute not only novel theoretical methods but also evaluations on different robotic tasks with users who had little or no prior experience with robots. The first challenge is to enable robots to learn cooperative skills from a potentially open-ended stream of human demonstrations in an incremental fashion. While learning new skills from human demonstrations has already been exploited in the literature, it remains challenging to learn skill libraries from incrementally incoming demonstrations when the total number of skills is not known beforehand. Therefore, in the first part of the thesis, we introduce an approach for online and incremental learning of a library of collaborative skills. Here, we follow a Mixture-of-Experts-based approach and incrementally learn a library of collaborative skills and a gating model from coupled human-robot trajectories. Once trained, the gating model can decide, based on prior demonstrations, which skill is an appropriate response to a human motion and activate the corresponding robot skill. In contrast to existing batch learning methods, our method does not require the total number of skills to be known a priori and can learn new skills as well as update existing skills from multiple human demonstrations. The cooperative skills are represented as Probabilistic Interaction Primitives, which can capture the variance and inherent correlations in the demonstrations. We evaluate our method with different human subjects in a task where a robot assists the subjects in making a salad, and we also evaluate how learned skills transfer between different subjects.

    Second, intelligent assistive robots should be able to adapt learned skills to humans when working in close contact or in shared workspaces. For Probabilistic Movement Primitives (ProMPs), which were chosen as the skill representation in this thesis, such methods for online adaptation have so far been missing in the literature. It is particularly important to also evaluate humans' perceived level of safety and comfort under different adaptation strategies. To this end, we present two novel methods for online adaptation of learned ProMP skills in a shared-workspace setting, namely spatial deformation and temporal scaling. Spatial deformation avoids collisions by dynamically changing the shape of the movement primitive while at the same time staying close to the demonstrated motions. In temporal scaling, we adapt the ProMP's velocity profile to avoid time-dependent collisions. To achieve intention-aware adaptation in shared workspaces, we combine both methods with a goal-directed prediction model for human motions, which can itself be learned online from human motions. We conducted experiments comparing both novel adaptation methods to non-adaptive behavior with inexperienced users and evaluated the influence on task performance as well as on subjective metrics such as comfort and the perceived level of safety.

    The third challenge that we consider in this thesis is how a library of learned skills can be used in practice to solve sequential robotic tasks. While reinforcement learning offers a powerful tool for reward-driven learning and self-improvement, in real robotic applications it often suffers from costly and time-consuming sample collection. Here, human input might be beneficial to speed up and guide the learning. It is therefore important to enable and compare different ways in which human input can be incorporated into reinforcement learning algorithms. In this thesis, we present an approach that incorporates multiple forms of human input into reinforcement learning for sequential tasks. Since, depending on the task, human input might not always be correct, we additionally introduce a concept of self-confidence for the robot, such that it becomes able to question human input. We evaluate which input channels humans prefer during interaction and how well they accept suggestions or rejections from the robot once it becomes confident in its own decisions.

    To summarize, the different parts of the thesis contribute to the development of intelligent assistive robots that can learn from imitating humans, adapt the learned skills dynamically to humans in shared workspaces, and profit and learn from human input during self-driven learning of how to sequence skills into more complex tasks. The three main contributions to the state of the art are: first, a novel approach to incrementally learn a library of collaborative skills when the total number of skills is not known a priori; second, two novel methods for online adaptation of ProMPs and their combination with a goal-directed prediction model to enable intention-aware online adaptation in shared workspaces; and third, an approach that combines multiple forms of human input with a reinforcement learning algorithm and a novel concept of self-confidence to learn and improve the sequencing of skills into more complex tasks.
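
    For the third contribution, the self-confidence idea described above can be illustrated with a deliberately simple sketch: the robot defers to a human suggestion while its own value estimates are still ambiguous, and may reject the suggestion once the gap between its best and second-best action is large enough. The confidence measure, threshold, and function below are hypothetical illustrations, not the actual algorithm of the thesis.

```python
def choose_action(q_values, human_suggestion=None, confidence_threshold=0.8):
    """Follow a human suggestion while the robot's own value estimates are
    ambiguous; once the normalized gap between the best and second-best action
    exceeds the threshold, the robot may reject the suggestion."""
    ranked = sorted(range(len(q_values)), key=lambda a: q_values[a], reverse=True)
    best, second = ranked[0], ranked[1]
    spread = max(q_values) - min(q_values) or 1.0     # avoid division by zero
    confidence = (q_values[best] - q_values[second]) / spread
    if human_suggestion is not None and confidence < confidence_threshold:
        return human_suggestion                       # defer to the human
    return best                                       # robot trusts its own estimate
```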